AITopics | perfect generalization

Dynamics of Generalization in Linear Perceptrons

Neural Information Processing Systems | Apr-6-2023, 19:28:49 GMT

We study the evolution of the generalization ability of a simple linear perceptron with N inputs which learns to imitate a "teacher perceptron". The system is trained on $p = \alpha N$ binary example inputs and the generalization ability measured by testing for agreement with the teacher on all $2^N$ possible binary input patterns. The dynamics may be solved analytically and exhibits a phase transition from imperfect to perfect generalization at $\alpha = 1$. Except at this point the generalization ability approaches its asymptotic value exponentially, with critical slowing down near the transition; the relaxation time is $\propto (1 - \sqrt{\alpha})^{-2}$. Right at the critical point, the approach to perfect generalization follows a power law $\propto t^{-1/2}$.

generalization, generalization ability, linear perceptron, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.66)
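
As a rough illustration of the teacher-student setup in the abstract above, the following Python sketch trains a linear student by gradient descent on p random examples and tracks |w - u|^2 / N, where u is the teacher and w the student. For uniformly random +/-1 inputs this quantity equals the mean squared output difference over all 2^N patterns. The +/-1 input encoding, the Gaussian teacher, the 1/sqrt(N) output scaling, the learning rate, and the squared-error measure are assumptions made here for illustration, not details taken from the paper.

```python
import random


def train_student(n_inputs, n_examples, steps=800, lr=0.1, seed=0):
    """Gradient descent for a linear student imitating a linear teacher.

    Returns a list with |w - u|^2 / N after each step (w: student, u: teacher).
    All modelling choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    scale = n_inputs ** -0.5  # outputs are (weights . inputs) / sqrt(N)
    teacher = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]
    inputs = [[rng.choice((-1.0, 1.0)) for _ in range(n_inputs)]
              for _ in range(n_examples)]
    targets = [scale * sum(u * x for u, x in zip(teacher, xs)) for xs in inputs]

    student = [0.0] * n_inputs  # start from zero weights
    history = []
    for _ in range(steps):
        # Gradient of the training error 0.5 * sum_mu (student_out - teacher_out)^2.
        grad = [0.0] * n_inputs
        for xs, target in zip(inputs, targets):
            err = scale * sum(w * x for w, x in zip(student, xs)) - target
            for i, x in enumerate(xs):
                grad[i] += err * scale * x
        student = [w - lr * g for w, g in zip(student, grad)]
        history.append(
            sum((w - u) ** 2 for w, u in zip(student, teacher)) / n_inputs
        )
    return history


if __name__ == "__main__":
    under = train_student(16, 8)   # alpha = p / N = 0.5
    over = train_student(16, 48)   # alpha = p / N = 3
    print("final error, alpha = 0.5:", under[-1])
    print("final error, alpha = 3.0:", over[-1])
```

With fewer examples than inputs (p < N) the error in this sketch plateaus at a nonzero value, because the training data leave part of the teacher undetermined; with p > N it decays toward zero, in line with the transition the abstract describes.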


Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization

Bun, Mark, Gaboardi, Marco, Hopkins, Max, Impagliazzo, Russell, Lei, Rex, Pitassi, Toniann, Sivakumar, Satchit, Sorrell, Jessica

arXiv.org Artificial Intelligence | Mar-24-2023

The notion of replicable algorithms was introduced in Impagliazzo et al. [STOC '22] to describe randomized algorithms that are stable under the resampling of their inputs. More precisely, a replicable algorithm gives the same output with high probability when its randomness is fixed and it is run on a new i.i.d. sample drawn from the same distribution. Using replicable algorithms for data analysis can facilitate the verification of published results by ensuring that the results of an analysis will be the same with high probability, even when that analysis is performed on a new data set. In this work, we establish new connections and separations between replicability and standard notions of algorithmic stability. In particular, we give sample-efficient algorithmic reductions between perfect generalization, approximate differential privacy, and replicability for a broad class of statistical problems. Conversely, we show any such equivalence must break down computationally: there exist statistical problems that are easy under differential privacy, but that cannot be solved replicably without breaking public-key cryptography. Furthermore, these results are tight: our reductions are statistically optimal, and we show that any computational separation between DP and replicability must imply the existence of one-way functions. Our statistical reductions give a new algorithmic framework for translating between notions of stability, which we instantiate to answer several open questions in replicability and privacy. This includes giving sample-efficient replicable algorithms for various PAC learning, distribution estimation, and distribution testing problems, algorithmic amplification of $\delta$ in approximate DP, conversions from item-level to user-level privacy, and the existence of private agnostic-to-realizable learning reductions under structured distributions.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2303.12921

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.13)
North America > United States > California > Los Angeles County > Long Beach (0.13)
North America > Canada > Ontario > Toronto (0.13)
(21 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.88)
Information Technology > Data Science (0.87)
Information Technology > Security & Privacy (0.87)
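
To make the definition of replicability in the abstract above concrete, here is a minimal Python sketch of one well-known idea: estimate a mean, then round it to a grid whose offset is drawn from the algorithm's internal randomness. Two runs that share that randomness but see different samples usually land in the same grid cell and so return the identical value. The function names, the grid width, and the Bernoulli data are hypothetical choices for illustration; this is not presented as the construction used in the paper.

```python
import random


def replicable_mean(sample, shared_seed, grid_width=0.1):
    """Round the empirical mean to a grid with a randomly shifted origin.

    `shared_seed` stands in for the algorithm's fixed internal randomness.
    """
    offset = random.Random(shared_seed).uniform(0.0, grid_width)
    mean = sum(sample) / len(sample)
    cell = round((mean - offset) / grid_width)  # index of the nearest grid point
    return offset + cell * grid_width


def draw_sample(n, p, rng):
    """Draw n i.i.d. Bernoulli(p) values as floats."""
    return [1.0 if rng.random() < p else 0.0 for _ in range(n)]


if __name__ == "__main__":
    data_rng = random.Random(1)
    first = draw_sample(20000, 0.3, data_rng)
    second = draw_sample(20000, 0.3, data_rng)
    agree = sum(
        replicable_mean(first, s) == replicable_mean(second, s)
        for s in range(200)
    )
    print(f"identical outputs for {agree} of 200 shared seeds")
```

The two runs disagree only when a grid boundary falls between the two empirical means, so a coarser grid or a larger sample raises the agreement rate at the cost of accuracy or data.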


Weight Priors for Learning Identity Relations

Kopparti, Radha, Weyde, Tillman

arXiv.org Machine Learning | Mar-6-2020

Learning abstract and systematic relations has been an open issue in neural network learning for over 30 years. It has been shown recently that neural networks do not learn relations based on identity and are unable to generalize well to unseen data. The Relation Based Pattern (RBP) approach has been proposed as a solution for this problem. In this work, we extend RBP by realizing it as a Bayesian prior on network weights to model the identity relations. This weight prior leads to a modified regularization term in otherwise standard network learning. In our experiments, we show that the Bayesian weight priors lead to perfect generalization when learning identity based relations and do not impede general neural network learning. We believe that the approach of creating an inductive bias with weight priors can be extended easily to other forms of relations and will be beneficial for many other learning tasks.

identity relation, neural network, relation, (15 more...)

arXiv.org Machine Learning

2003.03125

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
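
The abstract above does not state the form of the prior. As a sketch only, assuming an isotropic Gaussian prior with mean weights $\mu$ (standing for the weights that encode the identity relation) and variance $\sigma^2$, a maximum a posteriori treatment would turn the prior into a penalty on the distance between the weights $w$ and $\mu$. The symbols $\mu$, $\sigma$, and $\lambda$ are introduced here for illustration and may differ from the paper's formulation:

```latex
\begin{align*}
-\log p(w) &= \frac{1}{2\sigma^2}\,\lVert w - \mu \rVert_2^2 + \text{const}, \\
\mathcal{L}_{\text{MAP}}(w) &= \mathcal{L}_{\text{data}}(w) + \lambda\,\lVert w - \mu \rVert_2^2,
\qquad \lambda = \frac{1}{2\sigma^2}.
\end{align*}
```

Under this assumption, setting $\mu = 0$ recovers ordinary L2 weight decay, which is one way to read the phrase "modified regularization term".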


Hidden Unit Specialization in Layered Neural Networks: ReLU vs. Sigmoidal Activation

Oostwal, Elisa, Straat, Michiel, Biehl, Michael

arXiv.org Machine Learning | Oct-16-2019

We study layered neural networks of rectified linear units (ReLU) in a modelling framework for stochastic training processes. The comparison with sigmoidal activation functions is in the center of interest. We compute typical learning curves for shallow networks with K hidden units in matching student teacher scenarios. The systems exhibit sudden changes of the generalization performance via the process of hidden unit specialization at critical sizes of the training set. Surprisingly, our results show that the training behavior of ReLU networks is qualitatively different from that of networks with sigmoidal activations. In networks with K >= 3 sigmoidal hidden units, the transition is discontinuous: Specialized network configurations co-exist and compete with states of poor performance even for very large training sets. On the contrary, the use of ReLU activations results in continuous transitions for all K: For large enough training sets, two competing, differently specialized states display similar generalization abilities, which coincide exactly for large networks in the limit K to infinity.

configuration, neural network, transition, (13 more...)

arXiv.org Machine Learning

1910.07476

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Netherlands (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Setting (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Examples of learning curves from a modified VC-formalism

Kowalczyk, Adam, Szymanski, Jacek, Bartlett, Peter L., Williamson, Robert C.

Neural Information Processing Systems | Dec-31-1996

We examine the issue of evaluation of model specific parameters in a modified VC-formalism. Two examples are analyzed: the 2-dimensional homogeneous perceptron and the 1-dimensional higher order neuron. Both models are solved theoretically, and their learning curves are compared against true learning curves. It is shown that the formalism has the potential to generate a variety of learning curves, including ones displaying "phase transitions."

learning curve, phase transition, vc-formalism, (13 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > South Australia (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.35)


Examples of learning curves from a modified VC-formalism

Kowalczyk, Adam, Szymanski, Jacek, Bartlett, Peter L., Williamson, Robert C.

Neural Information Processing Systems | Dec-31-1996

We examine the issue of evaluation of model specific parameters in a modified VC-formalism. Two examples are analyzed: the 2-dimensional homogeneous perceptron and the 1-dimensional higher order neuron. Both models are solved theoretically, and their learning curves are compared against true learning curves. It is shown that the formalism has the potential to generate a variety of learning curves, including ones displaying "phase transitions."

artificial intelligence, learning curve, machine learning, (15 more...)

Neural Information Processing Systems

Country: Oceania > Australia (0.29)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.35)


Dynamics of Generalization in Linear Perceptrons

Krogh, Anders, Hertz, John A.

Neural Information Processing Systems | Dec-31-1991

We study the evolution of the generalization ability of a simple linear perceptron with N inputs which learns to imitate a "teacher perceptron". The system is trained on $p = \alpha N$ binary example inputs and the generalization ability measured by testing for agreement with the teacher on all $2^N$ possible binary input patterns. The dynamics may be solved analytically and exhibits a phase transition from imperfect to perfect generalization at $\alpha = 1$. Except at this point the generalization ability approaches its asymptotic value exponentially, with critical slowing down near the transition; the relaxation time is $\propto (1 - \sqrt{\alpha})^{-2}$.

generalization, generalization ability, perfect generalization, (12 more...)

Neural Information Processing Systems

Country:

Europe > Denmark > Capital Region > Copenhagen (0.05)
Asia > Singapore (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.84)


Dynamics of Generalization in Linear Perceptrons

Krogh, Anders, Hertz, John A.

Neural Information Processing Systems | Dec-31-1991

We study the evolution of the generalization ability of a simple linear perceptron with N inputs which learns to imitate a "teacher perceptron". The system is trained on $p = \alpha N$ binary example inputs and the generalization ability measured by testing for agreement with the teacher on all $2^N$ possible binary input patterns. The dynamics may be solved analytically and exhibits a phase transition from imperfect to perfect generalization at $\alpha = 1$. Except at this point the generalization ability approaches its asymptotic value exponentially, with critical slowing down near the transition; the relaxation time is $\propto (1 - \sqrt{\alpha})^{-2}$.

artificial intelligence, generalization, machine learning, (14 more...)

Neural Information Processing Systems

Country: Europe > Denmark (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.84)
